bayesynergy: flexible Bayesian modelling of synergistic interaction effects in in-vitro drug combination experiments

Leiv Rønneberg

05/09/2022


Introduction

The bayesynergy R package implements a Bayesian semi-parametric regression model for estimating the dose-response function of in-vitro drug combination experiments. The Bayesian framework offers full uncertainty quantification of the dose response function and any derived summary statistics, as well as natural handling of replicates and missing data. The Bayesian model is implemented in Stan (Stan Development Team (2020)), taking advantage of the efficient ‘No U-Turn Sampler’ as well as variational Bayes for quick approximations of the true posterior.

The package is further equipped with plotting functions for summarizing a drug response experiment, parallel processing for large drug combination screens, as well as tools for summarizing and comparing the fits across a screen.

The model

The dose-response function \(f:\boldsymbol{x} \to (0,1)\), maps drug concentrations \(\boldsymbol{x}\) to a measure of cell viability – zero corresponding to all cells being dead after treatment, one corresponding to all cells still alive. In drug-combination screens, it is common to assume that the dose-response function can be broken down as \[ f(\boldsymbol{x}) = p_0(\boldsymbol{x})+\Delta(\boldsymbol{x}), \] where \(p_0(\boldsymbol{x})\) encodes a non-interaction assumption, and \(\Delta(\boldsymbol{x})\) captures the residual interaction effect.

Non-interaction

The non-interaction assumption, \(p_0(\boldsymbol{x})\), captures what can be reasonably assumed about a joint drug effect, given estimates of the drugs’ individual effects. We assume a Bliss-style independence assumption, where we first assume that the individual drugs’ dose-response functions take the form of a log-logistic curve \[ h_i(x_i|l,s,m) = l + \frac{1-l}{1+10^{s(x_i-m)}}, \] where \(l\) is the lower asymptote, \(s\) the slope, and \(m\) the drug’s EC50 on the \(\log_{10}\) scale. The Bliss assumption then amounts to a probabilistic independence assumption, where \[ p_0(\boldsymbol{x}) = h_1(x_1|l_1,s_1,m_1) \ h_2(x_2|l_2,s_2,m_2). \] We call it probabilistic, because we can interpret the individual dose-response curves \(h_i()\) as probabilities of cell survival. Defining the events \[ \begin{align} A_i & = \text{A cell survives drug A at concentration $x_{1i}$} \\ B_j & = \text{A cell survives drug B at concentration $x_{2j}$} \\ C_{ij} & = \text{A cell survives both drugs at concentration $\boldsymbol{x}=(x_{1i},x_{2j})$}, \end{align} \] the corresponding probabilities become \[ p_0(\boldsymbol{x}) = P(C_{ij}) = P(A_i)P(B_j) = h_1(x_1|l_1,s_1,m_1) \ h_2(x_2|l_2,s_2,m_2). \]
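As a concrete sketch, the Bliss surface can be computed from two fitted log-logistic curves; the parameter values below are hypothetical, chosen only for illustration:

```r
# Log-logistic monotherapy curve: l = lower asymptote, s = slope,
# m = log10(EC50); x is the drug concentration on the log10 scale.
loglogistic <- function(x, l, s, m) {
  l + (1 - l) / (1 + 10^(s * (x - m)))
}

# Bliss non-interaction surface: the product of the two monotherapy
# curves, read as independent probabilities of cell survival.
bliss_p0 <- function(x1, x2, par1, par2) {
  loglogistic(x1, par1$l, par1$s, par1$m) *
    loglogistic(x2, par2$l, par2$s, par2$m)
}

# At its EC50, a curve with l = 0 sits at 50% viability
loglogistic(0, l = 0, s = 1, m = 0)  # 0.5
```

With both drugs at their EC50 and lower asymptotes at zero, the Bliss surface is simply \(0.5 \times 0.5 = 0.25\).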

Interaction

The interaction component, \(\Delta(\boldsymbol{x})\), captures any joint effect of the drugs that is not captured by the non-interaction assumption. If two drugs are more effective together than would be expected under \(p_0\), we call it synergy, which corresponds to \(\Delta <0\). The opposite effect is deemed antagonism.

Because the interaction landscape can be complex, with multiple local peaks and valleys, we model this term non-parametrically using a Gaussian process (GP) prior. To ensure that the resulting dose-response function only takes values in the interval \((0,1)\), we push the GP through a transformation function \(g()\). That is, \[ z(\boldsymbol{x}) \sim \mathcal{GP}(0,\kappa(\boldsymbol{x},\boldsymbol{x}')) \\ \Delta(\boldsymbol{x}) = g(z(\boldsymbol{x})), \] where the transformation function is given by \[ g(z(\boldsymbol{x})) = \frac{-p_0(\boldsymbol{x})}{1+\exp\left\{b_1z(\boldsymbol{x})+\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}} + \frac{1-p_0(\boldsymbol{x})}{1+\exp\left\{-b_2z(\boldsymbol{x})-\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}}. \] In addition to ensuring the proper bounds for the dose-response function, this transformation has the property that \(g(0)=0\), which corresponds to the a priori assumption \[ \mathbb{E}\left[f(\boldsymbol{x}) | p_0(\boldsymbol{x})\right] \approx p_0(\boldsymbol{x}). \] That is, we turn our non-interaction assumption into a formal prior expectation on the dose-response function. This achieves two things: (1) a slightly conservative model that needs to be convinced that interaction effects are present, and (2) no built-in bias towards interaction in the prior structure.
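The transformation can be written out directly. A minimal R sketch, with \(b_1\) and \(b_2\) fixed at 1 for illustration:

```r
# Maps a latent GP value z to an interaction effect Delta, bounded so
# that p0 + Delta always stays inside (0,1); g(0) = 0 by construction.
g_transform <- function(z, p0, b1 = 1, b2 = 1) {
  logit_p0 <- log(p0 / (1 - p0))
  -p0 / (1 + exp(b1 * z + logit_p0)) +
    (1 - p0) / (1 + exp(-b2 * z - logit_p0))
}

g_transform(0,   p0 = 0.3)  # 0: no interaction a priori
g_transform(-50, p0 = 0.3)  # approaches -p0 = -0.3 (maximal synergy)
g_transform(50,  p0 = 0.3)  # approaches 1 - p0 = 0.7 (maximal antagonism)
```

Note how the bounds \(-p_0\) and \(1-p_0\) guarantee that \(f = p_0 + \Delta\) remains a valid viability.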

The covariance function \(\kappa(\boldsymbol{x},\boldsymbol{x}')\) can be given multiple specifications, including a squared exponential, Matérn, and Rational Quadratic covariance functions. By default, we use a Matérn covariance with the \(\nu\) parameter set to 3/2 yielding \[ \kappa(\boldsymbol{x},\boldsymbol{x}') = \sigma_f^2\left(1+\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right)\exp\left\{-\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right\}. \] Finally, by utilizing the natural grid structure of the drug concentrations, we can write the kernel function as \[ \kappa(\boldsymbol{x},\boldsymbol{x}') = \sigma_f^2 \kappa(x_1,x_1')\kappa(x_2,x_2'), \] which induces a Kronecker product structure on the final covariance matrix. Following the implementation detailed in Flaxman et al. (2015), this greatly improves the computational efficiency of the model.
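The Kronecker trick is easy to illustrate. The sketch below builds the Matérn 3/2 covariance per dose axis and combines the two small matrices with R’s `%x%` (Kronecker product) operator; the grid values and hyperparameters are made up:

```r
# Matern 3/2 covariance matrix between two sets of 1-d inputs
matern32 <- function(x, xp, ell) {
  d <- abs(outer(x, xp, "-"))
  (1 + sqrt(3) * d / ell) * exp(-sqrt(3) * d / ell)
}

x1 <- c(-1, 0, 1)        # log10 concentrations, drug 1
x2 <- c(-2, -1, 0, 1)    # log10 concentrations, drug 2
sigma_f <- 1.5; ell <- 1

# Full grid covariance as a Kronecker product of the per-axis
# matrices, instead of building the 12 x 12 kernel element by element.
K <- sigma_f^2 * (matern32(x1, x1, ell) %x% matern32(x2, x2, ell))
dim(K)  # 12 x 12
```

In the actual implementation the Kronecker structure is exploited further (e.g. for fast matrix-vector products), but the factorisation above is the basic idea.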

The observation model

Given the above formulation of the dose-response function \(f\), we assume that we have access to noisy observations of it. These observations are typically generated by various cellular assays, e.g. viability assays. In particular, we assume that at concentration points \(\boldsymbol{x}_1,\ldots,\boldsymbol{x}_n\) we have observations \(y_1,\ldots,y_n\), where \[ y_i = f(\boldsymbol{x}_i) + \epsilon_i, \] and the errors \(\epsilon_i\) are normally distributed with mean zero. By default, we model the variance of the observational errors in a heteroscedastic fashion as \[ \text{Var}\left[\epsilon_i\right] = \sigma^2(f(\boldsymbol{x}_i)+\lambda), \] where \(\lambda\) is set to a small value to handle the case where \(f = 0\) but some residual noise remains. In a typical setup where cell viability is calculated through a normalization to positive and negative controls, \(\lambda\) can be set empirically as \[ \lambda = \frac{\sigma^2_{+}}{\sigma^2_{-}}, \] where \(\sigma^2_{+}\) and \(\sigma^2_{-}\) denote the variances of the positive and negative controls, respectively.
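As an illustration, \(\lambda\) can be estimated from control readings like this; the control values below are simulated placeholders, not real assay data:

```r
set.seed(1)
neg <- rnorm(32, mean = 1000, sd = 100)  # negative controls: untreated growth
pos <- rnorm(32, mean = 50,   sd = 10)   # positive controls: fully killed wells
lambda <- var(pos) / var(neg)            # ratio of control variances
lambda  # small, since the positive controls are much less variable
```

Note that the ratio is unchanged by the viability normalization, since both variances are scaled by the same squared denominator.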

We choose a heteroscedastic model by default, because in cell viability assays, the observations are normalized in relation to positive and negative controls. The positive controls typically have much lower variance compared to the negative controls, which translates to viability measures closer to zero being more precisely measured. We also allow homoscedastic noise as an option.

Including controls

The positive and negative controls essentially control the signal-to-noise ratio in cell viability assays. If the user has access to these, they can be included in the model to help calibrate the posterior distribution, particularly when the experiment has no replicates.

Let \(\xi^-_k\) and \(\xi^+_l\) denote the negative and positive controls for \(k=1,\ldots,n_-\) and \(l=1,\ldots,n_+\). These measurements are raw readings from the plate and are used to calculate cell viability. For an additional well, treated with drug concentration \(\mathbf{x}_i\), we denote the raw output by \(\xi_i\), and calculate cell viability for this well by the formula: \[ y_i = \frac{\xi_i-\tilde{\xi^+}}{\tilde{\xi^-}-\tilde{\xi^+}}, \] where \(\tilde{\xi^-}\) and \(\tilde{\xi^+}\) denote some measure of centrality of the negative and positive controls, typically the mean or median.
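In code, the normalization amounts to the following, here using the median as the centrality measure; the raw readings are hypothetical:

```r
# Convert a raw plate reading xi to viability, relative to the medians
# of the negative (untreated) and positive (killed) controls.
normalize_viability <- function(xi, neg, pos) {
  (xi - median(pos)) / (median(neg) - median(pos))
}

neg <- c(950, 1000, 1050)  # raw negative control readings
pos <- c(45, 50, 55)       # raw positive control readings
normalize_viability(1000, neg, pos)  # 1: full viability
normalize_viability(50, neg, pos)    # 0: no viable cells
```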

The controls can themselves be passed through this formula and converted to % viability. From the variances of these normalized controls, \(\lambda\) can be set as indicated above. The negative controls can also be added directly into the model: since they represent unhindered cell growth, they can be thought of as samples from the dose-response function \(f(\mathbf{x})\) at concentration \(\mathbf{x}=(0,0)\), and can be passed to the \(\texttt{bayesynergy}\) function in the same way as regular observations.

Full model specification

The full model specification, with all default prior distributions, looks like \[ y_i \sim \mathcal{N}\left(f(\boldsymbol{x}_i),\sigma^2(f(\boldsymbol{x}_i)+\lambda)\right), \ i = 1,\ldots, n \\ \sigma \sim \text{Inv-Ga}\left(5,1\right), \ \lambda = 0.005, \\ f(\boldsymbol{x}_i) = p_0(\boldsymbol{x}_i)+\Delta(\boldsymbol{x}_i) \mathbb{I}(10^{\boldsymbol{x}_i}>0) \\ p_0(\boldsymbol{x}) = h_1(x_1|l_1,s_1,m_1) \ h_2(x_2|l_2,s_2,m_2), \\ l_j \sim \text{Beta}(1,1.25), \ s_j \sim \text{Gamma}(1,1), \\ m_j \sim \mathcal{N}(\theta_j,\sigma_{m_j}^2), \ j = 1,2 \\ \theta_j \sim \mathcal{N}(0,1), \ \sigma_{m_j}^2 \sim \text{Inv-Ga}\left(3,2\right), \ j = 1,2 \\ \Delta(\boldsymbol{x}) = g(z(\boldsymbol{x})), \ z(\boldsymbol{x}) \sim \mathcal{GP}(0,\kappa(\boldsymbol{x},\boldsymbol{x}')) \\ g(z(\boldsymbol{x})) = \frac{-p_0(\boldsymbol{x})}{1+\exp\left\{b_1z(\boldsymbol{x})+\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}} + \frac{1-p_0(\boldsymbol{x})}{1+\exp\left\{-b_2z(\boldsymbol{x})-\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}} \\ \kappa(\boldsymbol{x},\boldsymbol{x}') = \sigma_f^2\left(1+\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right)\exp\left\{-\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right\}, \\ \sigma_f^2 \sim \text{log-}\mathcal{N}(1,1), \ \ell \sim \text{Inv-Ga}(5,5) \\ b_j \sim \mathcal{N}(1,0.1^2), \ j = 1,2. \] Note that some of these specifications can be altered. For example, by default we estimate the lower asymptotes, but they can also be fixed at zero.

In the model specification above, the interaction term is multiplied by an indicator function \(\mathbb{I}(10^{\boldsymbol{x}}>0)\), taking the value 1 if and only if all concentrations in \(\boldsymbol{x}\) are strictly positive (recall that \(\boldsymbol{x}\) is on the \(\log_{10}\) scale, so zero concentration corresponds to \(-\infty\)). This ensures that we do not allow for interaction when one of the drugs is at zero concentration.

Summary measures

From the posterior dose-response function \(f | \mathbf{y}\), we derive a number of summary statistics concerning efficacy, synergy and antagonism.

Monotherapy summaries

For the monotherapy curves, we produce estimates of the drug sensitivity score (DSS) of each drug by the integral

\[ DSS_0 = \int_a^b \left(1-h_j(x)\right) \text{d}x, \] where \(a=\min(x_{j})\) and \(b=\max(x_{j})\). That is, the integral is taken over the measured dose range of the drug in question. This is in contrast to how the regular DSS score is calculated, where integration starts where the monotherapy curve crosses a 90% viability threshold, a device intended to separate true effects from background noise. Since noise is handled here through the sampling model, this is not needed. The DSS value is further standardized by the total volume available for drug efficacy, \[ DSS = \frac{DSS_0}{(b-a)}. \] From here, values can be further standardized as in Yadav et al. (2014).
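Given a monotherapy curve evaluated at the measured doses, the DSS integral can be approximated by the trapezoidal rule. A sketch, where the scaling to a 0-100 range is an assumption made to match the scores reported by `summary()`:

```r
# Trapezoidal approximation of DSS over the measured log10 dose range:
# x are the doses, h the corresponding viability values in (0,1).
dss <- function(x, h) {
  eff  <- 1 - h                                       # efficacy above the curve
  dss0 <- sum(diff(x) * (head(eff, -1) + tail(eff, -1)) / 2)
  100 * dss0 / (max(x) - min(x))                      # standardize to 0-100
}

dss(x = c(0, 0.5, 1), h = rep(0.5, 3))  # flat 50% viability gives DSS 50
```

In the package, the curve values `h` would be posterior draws of \(h_j\), so DSS itself gets a full posterior distribution.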

Combination summaries

To summarise the combined drug-response function, we utilise the measures developed in Cremaschi et al. (2019). The basic building block is the ‘volume under the surface’ or VUS, for which the general integral looks like

\[ VUS_0(f) = \int_a^b \int_c^d f(\mathbf{x}) \ \text{d}\mathbf{x}, \] and the integrals are taken over the observed drug range, i.e. \(a = \min (x_1)\), \(b = \max (x_1)\), \(c = \min (x_2)\), \(d = \max (x_2)\). This is then standardised to obtain a value between zero and 100, \[ VUS(f) = \frac{VUS_0(f)}{(b-a)(d-c)}. \] Furthermore, to make this into an overall measure of efficacy, we define the residual VUS (rVUS) by

\[ rVUS(f) = 100 - VUS(f), \] which makes this value more comparable with the DSS values, where a higher number now indicates a larger efficacy of the drug combination.

The model calculates \(rVUS\) for the dose-response function \(f\), giving a measure of combined efficacy. In addition, we calculate \(rVUS(p_0)\), the non-interaction efficacy. This makes it possible to separate out how much of the total efficacy can be attributed to the non-interaction assumption. For the interaction term \(\Delta\), we compute \(VUS(\Delta)\) for the overall interaction effect, as well as \(VUS(\Delta^{-})\) and \(VUS(\Delta^{+})\) for synergy and antagonism, where \(\Delta^{-}\) and \(\Delta^{+}\) denote the negative and positive parts of \(\Delta\), respectively. That is,

\[ \Delta^{+}(\mathbf{x}) = \max(0,\Delta(\mathbf{x})) \\ \Delta^{-}(\mathbf{x}) = \min(0,\Delta(\mathbf{x})). \] We compute these measures because, frequently, the interaction surface contains both antagonistic and synergistic regions. When taking the average across the whole surface, an antagonistic outlier might cancel an otherwise strong synergistic effect.
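On the observed concentration grid, these volumes can be approximated with an iterated trapezoidal rule. A sketch, again with the 0-100 scaling assumed to match the package output:

```r
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# VUS of a surface over the observed dose rectangle, standardized to
# a 0-100 scale; surf is a length(x1) x length(x2) matrix of values.
vus <- function(x1, x2, surf) {
  inner <- apply(surf, 1, function(row) trapz(x2, row))
  100 * trapz(x1, inner) / ((max(x1) - min(x1)) * (max(x2) - min(x2)))
}

x1 <- 0:2; x2 <- 0:3
vus(x1, x2, matrix(0.5, 3, 4))  # flat 50% surface gives VUS 50

# Synergy/antagonism split via the negative/positive parts of Delta
Delta <- matrix(c(-0.2, 0.1, 0, -0.05), 2, 2)
vus(0:1, 0:1, pmin(Delta, 0))   # VUS(Delta^-): synergy (negative)
vus(0:1, 0:1, pmax(Delta, 0))   # VUS(Delta^+): antagonism (non-negative)
```

Computing these on each posterior draw of the surface yields posterior distributions for all the summary measures.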

Summarising large screens

When running screens with a large number of drug combinations, it is helpful to have a normalised measure for comparing synergy across experiments. The \(rVUS\) scores defined above are already standardised to the drug concentration ranges, but to compare across experiments, we also standardise with respect to the uncertainty in the model. To do this, we calculate a synergy score by normalising \(VUS(\Delta^{-})\) by its posterior standard deviation: \[ \text{Synergy score} = \frac{\text{mean}(VUS(\Delta^{-}))}{\text{sd}(VUS(\Delta^{-}))}. \]
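Given posterior draws of \(VUS(\Delta^{-})\), the score is simply a mean over a standard deviation; the draws below are simulated placeholders, not output from a real fit:

```r
set.seed(42)
# Hypothetical posterior draws of VUS(Delta^-) from an MCMC fit
vus_syn_draws <- rnorm(4000, mean = -9.8, sd = 2.4)

# Posterior mean normalised by posterior sd: a signal-to-noise ratio,
# comparable across experiments in a screen.
synergy_score <- mean(vus_syn_draws) / sd(vus_syn_draws)
synergy_score  # large magnitude = synergy well separated from noise
```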

Synergy classification

Frequently, it is of interest to classify an experiment as synergistic or antagonistic. Usually, this has been done by thresholding the synergy measure at a certain level, declaring e.g. everything above 10 as synergistic, everything below -10 as antagonistic, and anything in between as additive (no interaction). The problem with this is that it completely ignores the underlying measurement error, and as a consequence the thresholding procedure can lead to misclassification. A large estimated synergy score might clear the threshold even though the effect cannot be discerned from the background noise, while a genuine synergistic effect that is too small, for example because the dose ranges are slightly off, will be misclassified as additive. By incorporating the uncertainty into the classification, it can be done in a more principled manner.

In Bayesian inference, we can compute what is known as the model evidence. That is, given a probabilistic model \(\mathcal{M}\), and some data \(\mathcal{D}\) we think is generated from it, the evidence is defined as the probability of the data given the model, \(P(\mathcal{D} \vert \mathcal{M})\). We can use this quantity to compare different models; in particular, when comparing two distinct models we can define the Bayes factor, \(\text{BF}_{10}\): \[ \text{BF}_{10}=\frac{P(\mathcal{D}\vert\mathcal{M}_1)}{P(\mathcal{D}\vert\mathcal{M}_0)} = \frac{P(\mathcal{M}_1 \vert \mathcal{D})}{P(\mathcal{M}_0 \vert \mathcal{D})}\frac{P(\mathcal{M}_0)}{P(\mathcal{M}_1)}, \] where \(P(\mathcal{M}_1)\) and \(P(\mathcal{M}_0)\) denote the prior model probabilities. By defining \[ \mathcal{M}_0: f(\mathbf{x}) = p_0(\mathbf{x}) \\ \mathcal{M}_1: f(\mathbf{x}) = p_0(\mathbf{x}) + \Delta(\mathbf{x}), \] and computing \(\text{BF}_{10}\), the Bayes factor gives information on whether the interaction surface needs to be included in the model. A high value indicates that \(\mathcal{M}_1\) is preferred over \(\mathcal{M}_0\), and thus that some interaction is most likely present in the experiment. One still needs to choose a cutoff, but connecting it directly to the uncertainty in the model, via the model evidence, makes it less arbitrary. The thresholding itself can be done according to e.g. the table in Kass and Raftery (1995):

\(\text{BF}_{10}\)   Evidence against \(\mathcal{M}_0\)
------------------   ----------------------------------
1 to 3.2             Not worth more than a bare mention
3.2 to 10            Substantial
10 to 100            Strong
>100                 Decisive

The Bayes factor only gives information about whether or not an interaction is present. Depending on the classification task, one still needs to decide whether the effect is synergistic or antagonistic. For this, one could use e.g. the integral of the interaction surface, \(\text{VUS}(\Delta)\): if this is negative, the experiment is coded as synergistic; if positive, antagonistic.
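Putting the two pieces together, a classification rule could look like the sketch below; the threshold of 10 ("strong" evidence on the Kass and Raftery scale) is just one possible choice:

```r
# Classify an experiment from the Bayes factor BF_10 and the sign of
# VUS(Delta); bf_threshold follows the Kass & Raftery (1995) scale.
classify_experiment <- function(bf10, vus_delta, bf_threshold = 10) {
  if (bf10 < bf_threshold) return("additive")     # no convincing interaction
  if (vus_delta < 0) "synergistic" else "antagonistic"
}

classify_experiment(27.49, -9.75)  # "synergistic"
classify_experiment(2.1, -9.75)    # "additive"
```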

The calculation of the Bayes factor is implemented directly in the bayesynergy function, and can be triggered simply by adding bayes_factor = T to the call. The model evidence and the Bayes factor itself are computed via the bridgesampling package (Gronau, Singmann, and Wagenmakers (2020)).

A simple example – a single experiment

In the R package, we’ve attached two example datasets from a large drug combination screening experiment on diffuse large B-cell lymphoma. We’ll use these to show some simple use cases of the main functions and how to interpret the results.

Let’s load in the first example and have a look at it

library(bayesynergy)
data("mathews_DLBCL")
y = mathews_DLBCL[[1]][[1]]
x = mathews_DLBCL[[1]][[2]]
head(cbind(y,x))
##      Viability ibrutinib ispinesib
## [1,] 1.2295618    0.0000         0
## [2,] 1.0376006    0.1954         0
## [3,] 1.1813851    0.7812         0
## [4,] 0.5882688    3.1250         0
## [5,] 0.4666700   12.5000         0
## [6,] 0.2869514   50.0000         0

We see that the measured viability scores are stored in the vector y, while x is a matrix with two columns giving the corresponding concentrations at which the viability scores were read off.

Fitting the regression model is simple enough, and can be done on default settings by running the following code (where we add the names of the drugs involved and the concentration units for plotting purposes, and calculate the Bayes factor).

fit = bayesynergy(y,x, drug_names = c("ibrutinib", "ispinesib"),
                  units = c("nM","nM"),bayes_factor = T)
## 
## SAMPLING FOR MODEL 'gp_grid' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000163 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.63 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 2.82966 seconds (Warm-up)
## Chain 1:                2.6061 seconds (Sampling)
## Chain 1:                5.43576 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'gp_grid' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 4.8e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.48 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 2.77396 seconds (Warm-up)
## Chain 2:                1.70938 seconds (Sampling)
## Chain 2:                4.48334 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'gp_grid' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 7.6e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.76 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 3.49013 seconds (Warm-up)
## Chain 3:                2.74635 seconds (Sampling)
## Chain 3:                6.23648 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'gp_grid' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 6.1e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.61 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 2.72047 seconds (Warm-up)
## Chain 4:                2.0383 seconds (Sampling)
## Chain 4:                4.75876 seconds (Total)
## Chain 4: 
## 
## SAMPLING FOR MODEL 'nointeraction' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 4.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.42 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.376352 seconds (Warm-up)
## Chain 1:                0.242506 seconds (Sampling)
## Chain 1:                0.618858 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'nointeraction' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 2.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.38833 seconds (Warm-up)
## Chain 2:                0.294654 seconds (Sampling)
## Chain 2:                0.682984 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'nointeraction' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.387425 seconds (Warm-up)
## Chain 3:                0.294108 seconds (Sampling)
## Chain 3:                0.681533 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'nointeraction' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.9e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.396576 seconds (Warm-up)
## Chain 4:                0.4342 seconds (Sampling)
## Chain 4:                0.830776 seconds (Total)
## Chain 4:
## Calculating the Bayes factor

The resulting model can be summarised by running

summary(fit)
##                 mean  se_mean     sd      2.5%       50%  97.5%  n_eff  Rhat
## la_1[1]       0.3348 0.002131 0.0741  1.57e-01  3.45e-01  0.455 1207.5 1.001
## la_2[1]       0.3844 0.007323 0.0710  9.53e-02  3.97e-01  0.455   94.1 1.037
## log10_ec50_1  0.4807 0.004462 0.1564  2.39e-01  4.50e-01  0.863 1229.2 1.000
## log10_ec50_2 -1.0006 0.022637 1.0138 -3.31e+00 -8.51e-01  0.504 2005.6 1.002
## slope_1       2.0134 0.020333 0.8961  8.78e-01  1.81e+00  4.296 1942.3 1.000
## slope_2       1.4893 0.034577 1.1058  8.24e-02  1.23e+00  4.314 1022.8 1.002
## ell           3.0452 0.034864 1.4762  1.28e+00  2.72e+00  6.681 1792.9 1.001
## sigma_f       0.8431 0.016682 0.7972  1.71e-01  6.11e-01  2.915 2283.8 1.000
## s             0.0966 0.000264 0.0147  7.32e-02  9.46e-02  0.130 3111.4 1.001
## dss_1        33.4492 0.044352 2.9510  2.75e+01  3.35e+01 39.134 4426.9 1.000
## dss_2        59.4242 0.044896 2.8232  5.38e+01  5.95e+01 64.692 3954.2 1.000
## rVUS_f       82.7605 0.013293 0.8630  8.10e+01  8.28e+01 84.473 4214.9 1.000
## rVUS_p0      73.0108 0.033918 2.2650  6.85e+01  7.31e+01 77.171 4459.5 0.999
## VUS_Delta    -9.7497 0.034910 2.4208 -1.47e+01 -9.66e+00 -5.313 4808.5 1.000
## VUS_syn      -9.7978 0.034389 2.3759 -1.47e+01 -9.68e+00 -5.481 4773.5 1.000
## VUS_ant       0.0481 0.001926 0.1124  5.10e-06  8.22e-05  0.402 3404.8 1.000
## 
## log-Pseudo Marginal Likelihood (LPML) =  52.53337 
## Estimated Bayes factor in favor of full model over non-interaction only model:  27.48539

which gives posterior summaries of the parameters of the model. In addition, the summary includes statistics of the monotherapy curves and the dose-response surface, including drug sensitivity scores (DSS) for the two drugs in question, as well as the volumes capturing the notions of efficacy (rVUS_f), interaction (VUS_Delta), synergy (VUS_syn) and antagonism (VUS_ant).

As indicated, the total combined drug efficacy is around 80% (rVUS_f), of which around 70 percentage points can be attributed to \(p_0\) (rVUS_p0), leaving room for 10 percentage points worth of synergy (VUS_syn). We can also note that the model is fairly certain of this effect, with a 95% credible interval given as (-14.703, -5.481). The certainty of this is also verified by the Bayes factor, which at 27.49 indicates strong evidence of an interaction effect present in the model.

We can also create plots by simply running

plot(fit, plot3D = F)

which produces monotherapy curves, monotherapy summary statistics, 2D contour plots of the dose-response function \(f\), the non-interaction assumption \(p_0\) and the interaction \(\Delta\). The last plot displays the \(rVUS\) scores as discussed previously, with corresponding uncertainty.

The package can also generate interactive 3D plots by setting plot3D = T. These are displayed using the plotly library (Plotly Technologies Inc. (2015)).

A simple example – a synergy screen

The synergyscreen function provides a workflow for processing data from large drug combination screens, where multiple drugs are tested in combination on multiple cell lines. It takes as input a list of experiments, each entry being a list containing the elements needed for a call to the main regression function bayesynergy.

Included in the package is the result of a synergyscreen run on 583 drug combinations tested on the A-375 human melanoma cell line from ONeil et al. (2016). The synergyscreen object is a list with two entries: a data frame with parameter estimates from each experiment, and a list named failed, containing experiments that either failed to process entirely, or had an unsatisfactory fit.

data("ONeil_A375")
length(ONeil_A375$failed)
## [1] 2

We see that the dataset has two experiments that failed to process during an initial run of synergyscreen. There is a multitude of reasons why an experiment might fail to process: an input error, initialization problems, or problems with the parallel processing.

The entries of failed are themselves lists, each containing the information needed for a call to the bayesynergy function:

failed_experiment = ONeil_A375$failed[[1]]
names(failed_experiment)
## [1] "y"             "x"             "drug_names"    "experiment_ID"
head(cbind(failed_experiment$y,failed_experiment$x))
##      viability L778123 MK-4827
## [1,]     0.759   0.325   0.223
## [2,]     0.755   0.325   0.775
## [3,]     0.548   0.325   2.750
## [4,]     0.307   0.325  10.000
## [5,]     0.787   0.800   0.223
## [6,]     0.820   0.800   0.775

We can rerun the experiments that failed to process by simply passing the returned synergyscreen object back into the function. Note that we turn off the default options of saving each fit and plotting everything, and set method = "vb", indicating that variational inference is used to fit the model.

fit_screen = synergyscreen(ONeil_A375, save_raw = F, save_plots = F, parallel = F, 
                           bayesynergy_params = list(method = "vb"))
## Chain 1: ------------------------------------------------------------
## Chain 1: EXPERIMENTAL ALGORITHM:
## Chain 1:   This procedure has not been thoroughly tested and may be unstable
## Chain 1:   or buggy. The interface is subject to change.
## Chain 1: ------------------------------------------------------------
## Chain 1: 
## Chain 1: 
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000387 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 3.87 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Begin eta adaptation.
## Chain 1: Iteration:   1 / 250 [  0%]  (Adaptation)
## Chain 1: Iteration:  50 / 250 [ 20%]  (Adaptation)
## Chain 1: Iteration: 100 / 250 [ 40%]  (Adaptation)
## Chain 1: Iteration: 150 / 250 [ 60%]  (Adaptation)
## Chain 1: Iteration: 200 / 250 [ 80%]  (Adaptation)
## Chain 1: Iteration: 250 / 250 [100%]  (Adaptation)
## Chain 1: Success! Found best value [eta = 0.1].
## Chain 1: 
## Chain 1: Begin stochastic gradient ascent.
## Chain 1:   iter             ELBO   delta_ELBO_mean   delta_ELBO_med   notes 
## Chain 1:    100         -137.216             1.000            1.000
## Chain 1:    200          -90.691             0.757            1.000
## Chain 1:    300          -31.833             1.121            1.000
## Chain 1:    400          -16.813             1.064            1.000
## Chain 1:    500            8.048             1.469            1.000
## Chain 1:    600           29.445             1.345            1.000
## Chain 1:    700           44.126             1.201            0.893
## Chain 1:    800           52.628             1.071            0.893
## Chain 1:    900           62.217             0.969            0.727
## Chain 1:   1000           79.661             0.894            0.727
## Chain 1:   1100           79.555             0.794            0.513   MAY BE DIVERGING... INSPECT ELBO
## Chain 1:   1200           90.880             0.755            0.333   MAY BE DIVERGING... INSPECT ELBO
## Chain 1:   1300           95.238             0.575            0.219   MAY BE DIVERGING... INSPECT ELBO
## Chain 1:   1400          101.273             0.491            0.162
## Chain 1:   1500          110.030             0.190            0.154
## Chain 1:   1600          119.706             0.126            0.125
## Chain 1:   1700          129.962             0.101            0.081
## Chain 1:   1800          135.372             0.088            0.080
## Chain 1:   1900          142.390             0.078            0.079
## Chain 1:   2000          145.955             0.058            0.060
## Chain 1:   2100          150.294             0.061            0.060
## Chain 1:   2200          150.411             0.049            0.049
## Chain 1:   2300          153.116             0.046            0.049
## Chain 1:   2400          154.017             0.041            0.040
## Chain 1:   2500          157.363             0.035            0.029
## Chain 1:   2600          158.324             0.027            0.024
## Chain 1:   2700          156.021             0.021            0.021
## Chain 1:   2800          159.446             0.019            0.021
## Chain 1:   2900          156.957             0.016            0.018
## Chain 1:   3000          159.206             0.015            0.016
## Chain 1:   3100          158.777             0.012            0.015
## Chain 1:   3200          160.953             0.013            0.015
## Chain 1:   3300          159.286             0.013            0.014
## Chain 1:   3400          162.648             0.014            0.015
## Chain 1:   3500          161.060             0.013            0.014
## Chain 1:   3600          162.884             0.013            0.014
## Chain 1:   3700          163.358             0.012            0.014
## Chain 1:   3800          162.458             0.011            0.011
## Chain 1:   3900          162.783             0.009            0.010   MEAN ELBO CONVERGED
## Chain 1: 
## Chain 1: Drawing a sample of size 1000 from the approximate posterior... 
## Chain 1: COMPLETED.
## Chain 1: ------------------------------------------------------------
## Chain 1: EXPERIMENTAL ALGORITHM:
## Chain 1:   This procedure has not been thoroughly tested and may be unstable
## Chain 1:   or buggy. The interface is subject to change.
## Chain 1: ------------------------------------------------------------
## Chain 1: 
## Chain 1: 
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000203 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.03 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Begin eta adaptation.
## Chain 1: Iteration:   1 / 250 [  0%]  (Adaptation)
## Chain 1: Iteration:  50 / 250 [ 20%]  (Adaptation)
## Chain 1: Iteration: 100 / 250 [ 40%]  (Adaptation)
## Chain 1: Iteration: 150 / 250 [ 60%]  (Adaptation)
## Chain 1: Iteration: 200 / 250 [ 80%]  (Adaptation)
## Chain 1: Iteration: 250 / 250 [100%]  (Adaptation)
## Chain 1: Success! Found best value [eta = 0.1].
## Chain 1: 
## Chain 1: Begin stochastic gradient ascent.
## Chain 1:   iter             ELBO   delta_ELBO_mean   delta_ELBO_med   notes 
## Chain 1:    100          -19.251             1.000            1.000
## Chain 1:    200           65.855             1.146            1.292
## Chain 1:    300           81.429             0.828            1.000
## Chain 1:    400           99.174             0.666            1.000
## Chain 1:    500          100.229             0.535            0.191
## Chain 1:    600          112.051             0.463            0.191
## Chain 1:    700          115.495             0.401            0.179
## Chain 1:    800          114.997             0.352            0.179
## Chain 1:    900          118.868             0.316            0.106
## Chain 1:   1000           69.986             0.354            0.179
## Chain 1:   1100          117.826             0.295            0.179
## Chain 1:   1200          124.654             0.171            0.106
## Chain 1:   1300          116.066             0.159            0.074
## Chain 1:   1400          127.829             0.151            0.074
## Chain 1:   1500          133.973             0.154            0.074
## Chain 1:   1600          108.241             0.168            0.074
## Chain 1:   1700          132.456             0.183            0.092
## Chain 1:   1800          132.884             0.183            0.092
## Chain 1:   1900          132.526             0.180            0.092
## Chain 1:   2000          136.678             0.113            0.074
## Chain 1:   2100          134.877             0.074            0.055
## Chain 1:   2200          137.507             0.070            0.046
## Chain 1:   2300          137.841             0.063            0.030
## Chain 1:   2400          141.118             0.056            0.023
## Chain 1:   2500          138.944             0.053            0.019
## Chain 1:   2600          139.658             0.030            0.016
## Chain 1:   2700          142.104             0.013            0.016
## Chain 1:   2800          137.320             0.016            0.017
## Chain 1:   2900          146.252             0.022            0.019
## Chain 1:   3000          140.450             0.023            0.019
## Chain 1:   3100          143.051             0.024            0.019
## Chain 1:   3200          145.838             0.024            0.019
## Chain 1:   3300          145.853             0.024            0.019
## Chain 1:   3400          148.668             0.023            0.019
## Chain 1:   3500          140.638             0.027            0.019
## Chain 1:   3600          146.250             0.031            0.035
## Chain 1:   3700          144.928             0.030            0.035
## Chain 1:   3800          140.715             0.029            0.030
## Chain 1:   3900          143.588             0.025            0.020
## Chain 1:   4000          147.081             0.023            0.020
## Chain 1:   4100          142.744             0.025            0.024
## Chain 1:   4200          147.745             0.026            0.030
## Chain 1:   4300          150.514             0.028            0.030
## Chain 1:   4400          148.121             0.028            0.030
## Chain 1:   4500          146.775             0.023            0.024
## Chain 1:   4600          118.622             0.043            0.024
## Chain 1:   4700          151.074             0.063            0.030
## Chain 1:   4800          148.788             0.062            0.024
## Chain 1:   4900          145.882             0.062            0.024
## Chain 1:   5000          148.964             0.062            0.021
## Chain 1:   5100          149.325             0.059            0.020
## Chain 1:   5200          148.311             0.056            0.018
## Chain 1:   5300          150.114             0.055            0.016
## Chain 1:   5400          149.399             0.054            0.015
## Chain 1:   5500          149.456             0.053            0.015
## Chain 1:   5600          146.410             0.032            0.015
## Chain 1:   5700          148.144             0.011            0.012
## Chain 1:   5800          145.647             0.012            0.012
## Chain 1:   5900          150.855             0.013            0.012
## Chain 1:   6000          152.168             0.012            0.012
## Chain 1:   6100          144.875             0.017            0.012
## Chain 1:   6200          147.519             0.018            0.017
## Chain 1:   6300          150.208             0.018            0.018
## Chain 1:   6400          150.932             0.018            0.018
## Chain 1:   6500          151.201             0.019            0.018
## Chain 1:   6600          149.939             0.017            0.017
## Chain 1:   6700          146.972             0.018            0.018
## Chain 1:   6800          146.157             0.017            0.018
## Chain 1:   6900          150.142             0.016            0.018
## Chain 1:   7000          150.781             0.016            0.018
## Chain 1:   7100          149.732             0.011            0.008   MEDIAN ELBO CONVERGED
## Chain 1: 
## Chain 1: Drawing a sample of size 1000 from the approximate posterior... 
## Chain 1: COMPLETED.

We can also plot the result of the screen:

plot(fit_screen)

Diagnosing errors and warnings

Sometimes, the bayesynergy function may return with a warning. Ideally, we don’t want any warnings at all, and they should be examined closely, as the posterior samples could be unreliable. Usually, the warning will tell the user how to fix the problem at hand, e.g. by running the chains for longer (setting iter higher), or setting adapt_delta higher. See https://mc-stan.org/misc/warnings.html for some general tips.

Divergent Transitions

Most commonly, the sampler might complain about divergent transitions. The warning will typically look like this:

## Warning: There were 2316 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.

This is indicative of a posterior geometry that is tricky to explore. In the case where there are only a few divergent transitions, the usual trick is to set adapt_delta to a higher value, i.e. larger than the default of 0.9. This can be done through the control option in the bayesynergy call:

fit = bayesynergy(y, x, control = list(adapt_delta = 0.99))

However, the case above, where there are 2316 divergent transitions, is indicative of a misspecified model. In my experience, this can happen for a few reasons.

Robustness against outliers

We provide a version of the model that is more robust against the inevitable outliers in dose-response data. This is done by swapping out two components of the model formulation: the likelihood, and the prior for the kernel hyperparameters.

For the likelihood, we utilise the log-Pareto-tailed Normal (LPTN) distribution with mean \(\mu\) and standard deviation \(\sigma\), as described in Gagnon, Desgagné, and Bédard (2020) which takes the form \[ f(x)= \begin{cases} \frac{1}{\sigma}\phi\left(\frac{x-\mu}{\sigma}\right) & \text{if } \vert\frac{x-\mu}{\sigma}\vert \leq \tau \\ \frac{1}{\sigma}\phi(\tau)\frac{\tau}{\vert\frac{x-\mu}{\sigma}\vert}\left(\frac{\log (\tau)}{\log(\vert\frac{x-\mu}{\sigma}\vert)}\right)^{\lambda+1} & \text{if } \vert\frac{x-\mu}{\sigma}\vert > \tau \end{cases} \] where \(\phi\) denotes the standard normal pdf. The parameters \((\tau,\lambda)\) are controlled by the user-specified hyperparameter \(\rho \in (2\Phi(1)-1,1)\) as \[ \tau=\Phi^{-1}((1+\rho)/2), \ \ \ \ \lambda=2(1-\rho)^{-1}\phi(\tau)\tau\log(\tau) \]
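As a concrete reference, the LPTN density above can be sketched directly in R. The function name dlptn is ours for illustration and is not part of the bayesynergy API; it follows the two-branch formula term by term.

```r
# Sketch of the LPTN density of Gagnon, Desgagné, and Bédard (2020):
# standard normal in the centre, Pareto-type tails beyond |z| = tau.
dlptn <- function(x, mu = 0, sigma = 1, rho = 0.9) {
  tau    <- qnorm((1 + rho) / 2)                       # tail cut-off
  lambda <- 2 * (1 - rho)^(-1) * dnorm(tau) * tau * log(tau)
  z <- abs((x - mu) / sigma)
  ifelse(z <= tau,
         dnorm((x - mu) / sigma) / sigma,              # normal centre
         dnorm(tau) / sigma * (tau / z) *
           (log(tau) / log(z))^(lambda + 1))           # heavy tail
}
```

Note that the density is continuous at \(|z|=\tau\), and the tails decay much more slowly than the normal, which is what downweights outlying viability measurements.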

The robust likelihood can be utilized by setting robust = T when calling the bayesynergy function. The user-specified parameter \(\rho\) can be set by the rho argument, by default set to \(0.9\).
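In code, with y and x as in a standard bayesynergy call, this amounts to:

```r
# Fit with the robust LPTN likelihood; rho = 0.9 is the stated default,
# shown here only for explicitness.
fit_robust = bayesynergy(y, x, robust = T, rho = 0.9)
```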

For the kernel, we recommend utilising the Matérn kernel with a Penalized Complexity (PC) prior on the kernel hyperparameters. The PC prior for the Matérn covariance was described in Fuglstad et al. (2018), and strongly encourages small deviations from the null model, i.e. a high probability of the interaction term being zero. The PC prior for the kernel hyperparameters \((\sigma_f^2,\ell)\) (in the 2-dimensional case) takes the form \[ \pi(\sigma_f^2,\ell)=\tilde{\lambda}_1\tilde{\lambda}_2\ell^{-2}\sigma_f^{-1}\exp\left(-\tilde{\lambda}_1\ell^{-1}-\tilde{\lambda}_2\sigma_f\right), \] where \((\tilde{\lambda}_1,\tilde{\lambda}_2)\) are set to reflect the prior beliefs \(P(\ell < \ell_0)=\alpha_1\) and \(P(\sigma_f^2 > \sigma^2_{f0})=\alpha_2\) via \[ \tilde{\lambda}_1=-\log(\alpha_1)\,\ell_0, \ \ \ \ \tilde{\lambda}_2=-\frac{\log(\alpha_2)}{\sigma_{f0}}, \] where \((\ell_0,\alpha_1,\sigma_{f0}^2,\alpha_2)\) are hyperparameters set by the user. The PC prior can be enabled by setting pcprior = T when calling the bayesynergy function, with hyperparameters specified by the argument pcprior_hypers. By default, we recommend \((\ell_0,\alpha_1,\sigma_{f0}^2,\alpha_2)=(1,0.1,1,0.2)\).
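A call enabling the PC prior might then look as follows, assuming pcprior_hypers takes the four hyperparameters in the order \((\ell_0,\alpha_1,\sigma_{f0}^2,\alpha_2)\) given above:

```r
# Fit with the PC prior on the kernel hyperparameters; the hyperparameter
# values are the recommended defaults from the text.
fit_pc = bayesynergy(y, x, pcprior = T,
                     pcprior_hypers = c(1, 0.1, 1, 0.2))
```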

References

Cremaschi, Andrea, Arnoldo Frigessi, Kjetil Taskén, and Manuela Zucknick. 2019. “A Bayesian Approach to Study Synergistic Interaction Effects in in-Vitro Drug Combination Experiments.” https://arxiv.org/abs/1904.04901.
Flaxman, Seth, Andrew Gelman, Daniel Neill, Alex Smola, Aki Vehtari, and Andrew Gordon Wilson. 2015. “Fast Hierarchical Gaussian Processes.” Manuscript in Preparation.
Fuglstad, Geir-Arne, Daniel Simpson, Finn Lindgren, and Håvard Rue. 2018. “Constructing Priors That Penalize the Complexity of Gaussian Random Fields.” Journal of the American Statistical Association 114 (525): 445–52. https://doi.org/10.1080/01621459.2017.1415907.
Gagnon, Philippe, Alain Desgagné, and Mylène Bédard. 2020. “A New Bayesian Approach to Robustness Against Outliers in Linear Regression.” Bayesian Analysis 15 (2). https://doi.org/10.1214/19-ba1157.
Gronau, Quentin F., Henrik Singmann, and Eric-Jan Wagenmakers. 2020. “Bridgesampling: An r Package for Estimating Normalizing Constants.” Journal of Statistical Software 92 (10). https://doi.org/10.18637/jss.v092.i10.
Kass, Robert E., and Adrian E. Raftery. 1995. “Bayes Factors.” Journal of the American Statistical Association 90 (430): 773–95. https://doi.org/10.1080/01621459.1995.10476572.
ONeil, Jennifer, Yair Benita, Igor Feldman, Melissa Chenard, Brian Roberts, Yaping Liu, Jing Li, et al. 2016. “An Unbiased Oncology Compound Screen to Identify Novel Combination Strategies.” Molecular Cancer Therapeutics 15 (6): 1155–62. https://doi.org/10.1158/1535-7163.mct-15-0843.
Plotly Technologies Inc. 2015. “Collaborative Data Science.” Montreal, QC: Plotly Technologies Inc. https://plot.ly.
Stan Development Team. 2020. “RStan: The R Interface to Stan.” http://mc-stan.org/.
Yadav, Bhagwan, Tea Pemovska, Agnieszka Szwajda, Evgeny Kulesskiy, Mika Kontro, Riikka Karjalainen, Muntasir Mamun Majumder, et al. 2014. “Quantitative Scoring of Differential Drug Sensitivity for Individually Optimized Anticancer Therapies.” Scientific Reports 4 (1). https://doi.org/10.1038/srep05193.